Sorting in Linear Time

Zhao Cong

Lower bounds for sorting

The decision-tree model

  • A decision tree is a full binary tree that represents the comparisons between elements that are performed by a particular sorting algorithm operating on an input of a given size.
  • each of the permutations on elements must appear as one of the leaves of the decision tree.
  • each of these leaves must be reachable from the root by a downward path corresponding to an actual execution of the comparison sort(We shall refer to such leaves as “reachable.”)

A lower bound for the worst case

  • The length of the longest simple path from the root of a decision tree to any of its reachable leaves represents the worst-case number of comparisons that the corresponding sorting algorithm performs
  • The worst-case number of comparisons for a given comparison sort algorithm equals the height of its decision tree.
  • Theorem 8.1:Any comparison sort algorithm requires .comparisons in the worst case. Proof:Consider a decision tree of height with reachable leaves corresponding to a comparison sort on elements So
  • Corollary 8.2:Heapsort and merge sort are asymptotically optimal comparison sorts.

Counting sort

  • Counting sort assumes that each of the input elements is an integer in the range to , for some integer . When , the sort runs in time.
  • In the code for counting sort, we assume that the input is an array , and thus . We require two other arrays: the array holds the sorted output, and the arrayprovides temporary working storage.
    1
    2
    3
    4
    5
    6
    7
    8
    9
    10
    11
    12
    13
    COUNTING-SORT(A, B, k) 
    1 let C[0..k] be a new array
    2 for i = 0 to k
    3 C[i]=0
    4 for j =1 to A.length
    5 C[A[i]] = C[A[i]]+1
    6 // C[i] now contains the number of elements equal to i.
    7 for i = 1 to k
    8 C[i]= c[i]+C[i-1]
    9 // C[i] now contains the number of elements less than or equal to i.
    10 for j = A.length down to 1
    11 B[C[A[j]]] = A[j]
    12 C[A[j]]=C[A[j]]-1
  • It is not a comparison sort.
  • An important property of counting sort is that it is stable.number appears first in the input array appears first in the output array.

Radix sort

  • Intuitively, you might sort numbers on their most significant digit.Radix sort solves the problem of card sorting—counterintuitively—by sorting on the least significant digit first.
1
2
3
RADIX-SORT(A,d) 
1 for i = 1 to d
2 use a stable sort to sort array A on digit i
  • In order for radix sort to work correctly, the digit sorts must be stable
  • Lemma 8.3:Given -digit numbers in which each digit can take on up to possible values, RADIX-SORT correctly sorts these numbers in time if the stable sort it uses takes time.
  • When is constant and , we can make radix sort run in linear time.
  • Lemma 8.4:Given -bit numbers and any positive integer , RADIX-SORT correctly sorts these numbers in time if the stable sort it uses takestime for inputs in the range to .
  • The version of radix sort that uses counting sort as the intermediate stable sort does not sort in place, which many of the ‚-time comparison sorts do. Thus, when primary memory storage is at a premium, we might prefer an in-place algorithm such as quicksort.

Bucket sort

  • Bucket sort divides the interval into equal-sized subintervals, or buckets, and then distributes the n input numbers into the buckets.
  • Our code for bucket sort assumes that the input is an -element array and that each element in the array satisfies . The code requires an auxiliary array of linked lists (buckets).
    1
    2
    3
    4
    5
    6
    7
    8
    9
    10
    BUCKET-SORT(A) 
    1 n= A.length
    2 let B[O..n- 1] be a new array
    3 for i = 0 to n-1
    4 make B[i] an empty list
    5 for i = 1 to n
    6 insert A[i] into list B[⎣nA[i]⎦]
    7 for i= 0 to n-1
    8 sort list B[i] with insertion sort
    9 concatenate the lists B[O], B[1],..., B[n - 1] together in order

let be the random variable denoting the number of elements placed in bucket  We now analyze the average-case running time of bucket sort.Taking expectations of both sides and using linearity of expectation, we have we define indicator random variables: Thus, we expand the square and regroup terms: Indicator random variable is 1 with probability and 0 otherwise, and therefore: When , the variables and are independent, and hence: Substituting these two expected values in equation , we obtain: we conclude that the average-case running time for bucket sort is: